Difference of Normals as a Multi-Scale Operator in Unorganized Point Clouds 
Abstract

A novel multi-scale operator for unorganized 3D point 
clouds is introduced. The Difference of Normals (DoN) pro- 
vides a computationally efficient, multi-scale approach to 
processing large unorganized 3D point clouds. The appli- 
cation of DoN in the multi-scale filtering of two different 
real-world outdoor urban LIDAR scene datasets is quanti- 
tatively and qualitatively demonstrated. In both datasets the 
DoN operator is shown to segment large 3D point clouds 
into scale-salient clusters, such as cars, people, and lamp 
posts towards applications in semi-automatic annotation, 
and as a pre-processing step in automatic object recog- 
nition. The application of the operator to segmentation 
is evaluated on a large public dataset of outdoor LIDAR 
scenes with ground truth annotations. 

1. Introduction 

1.1. Motivation 

The increasing prevalence of 3D scanners has resulted 
in a dramatic explosion in the availability of 3D data, es- 
pecially raw sensor data often represented in the most ba- 
sic 3D point cloud format. Such sensors include LIDAR 
scanners for modelling large outdoor scenes and GIS ap- 
plications, as well as commercially available and inexpen- 
sive solutions for indoor scanning and modelling. Conse- 
quently the processing of point clouds of millions, or even 
hundreds of millions of points has become commonplace. 
Furthermore, new applications of range sensors that require 
the processing of large point clouds in real-time, such as 
self-driving cars [ ], have arisen.

For such datasets and their applications to be useful, or
even feasible, there is a demand for salient point selection 
algorithms based solely on an unorganized point cloud - 
as opposed to connected-graph and mesh-based algorithms 
which are typically more computationally and memory in- 
tensive. This has motivated the move towards using simple 
point cloud processing algorithms to filter a point cloud for 
salient points before applying complex algorithms, akin to 
the common usage of image processing filters in 2D com- 
puter vision algorithms. 

One such image processing filter is the Difference of 
Gaussians (DoG). The DoG is an approximation to the 
Laplacian of the Gaussian (LoG) operator, and is widely 
used in applications such as image enhancement, blob de- 
tection, edge detection, finding points of salience, pre- 
segmenting images [13], and perhaps most notably in the 
form of a DoG pyramid for obtaining scale invariance in 2D 
object recognition [12]. Although the DoG operator easily 
generalizes to so-called 2.5D data (i.e. depth images) and 
volumetric images (e.g. in medical imaging) [22], extend- 
ing it to unorganized data (i.e. point clouds), particularly in 
a computationally efficient manner, is less straightforward.

1.2. Contributions 

In this paper, a multi-scale operator of similar function 
to the DoG is introduced for unorganized 3D point clouds, 
namely the Difference of Normals (DoN). Despite the sim- 
plicity and efficiency of the operator, DoN is shown to be 
surprisingly powerful in assigning point saliency according 
to scale. While DoN is motivated in this paper as a multi- 
scale saliency feature used in a segmentation and/or object 
recognition pipeline, it also has applications to oriented 3D 
edge detection and planar region segmentation. An open 
source implementation of the DoN operator is made available
in the Point Cloud Library (PCL) [15] (see footnote 1).

1.3. Organization 

The rest of this paper is organized as follows: §2 summarizes
previous work on multi-scale operators and segmentation in
unorganized point clouds; §3 introduces the Difference of
Normals (DoN) operator; §4 introduces a method for parameter
selection, shows the application of the DoN operator to the
segmentation of real urban LIDAR scenes, and presents a
quantitative evaluation of DoN segmentation on a publicly
available dataset with ground truth annotations; §5 summarizes
the results and explores the potential for future work.

2. Previous Work 

2.1. Scale and Unorganized Point Clouds 

In 2D images, the concept of scale space is often described
with a family of gradually smoothed images created through
convolution with a Gaussian kernel. Such a Gaussian scale-space
has a wide range of applications in image processing and 2D
computer vision, such as edge sharpening and interest point
selection [12].

Extending the concept of scale-space to unorganized 3D 
point clouds is a non-trivial task due to the lack of a regular 
lattice from which points were sampled. One solution is to 
convert the data into an organized format, such as a dense 
voxel map, in which the generalization from 2D image processing
is straightforward. This, however, is an unrealistic task for
large point clouds, such as outdoor scenes with millions of
points, since the required number of voxels will be vast. An
octree representation is a possible solution for reducing the
memory requirements, but it comes at a computational cost.

Unnikrishnan et al., arguing the need for a multi-scale
unorganized point cloud operator, introduced such an operator
derived from the Laplace-Beltrami operator, a generalization of
the Laplacian to Riemannian manifolds [19]. Their operator was
derived in a scale-space theoretic manner and, being based on a
Gaussian convolution kernel, satisfies the scale-space axioms.
In application, however, the operator is
relatively computationally and memory intensive compared 
with our proposed method, as it requires the computation of 
a geodesic distance graph for the point cloud and the con- 
volution kernel is computationally expensive to compute. 
Furthermore it is unclear how the operator could be used to 
detect oriented features such as corners or edges. 

2.2. Normal Support Radius in 3D Point Features 

Many proposed features for unorganized point clouds have
directly or indirectly used the relation of support region size
in surface normal estimation as a method of scale or saliency
detection on the implicit surface of a 3D point cloud.

Footnote 1: Available as a feature in the PCL trunk: http://www.pointclouds.org.

Rusu et al. proposed Persistent Point Feature Histograms 
for unorganized point clouds. These features were calcu- 
lated in part using a series of normals estimated with a se- 
ries of increasing radii between a fixed minimum and maxi- 
mum radius [ ]. Novatnack and Nishino used surface nor- 
mals to create scale-dependent geometric features on trian- 
gular meshes. The mesh normals were interpolated over a 
2D parametrization of the mesh over each vertex, creating 
a normal map that was used in edge detection. They argued 
that while a normal map does not satisfy the axioms of a 
scale space operator, it is a natural choice as normals are 
less affected by noise than higher order derivative
quantities such as curvature.

2.3. Point Cloud Segmentation 

Various algorithms have been proposed in the area of 
point cloud segmentation. However, most algorithms for 
unorganized point clouds require meshing or connectivity
[4, 7]. Those that do not often require estimating the
normal map as an integral step. 

Liu et al. introduced Cell Mean Shift (CMS) [11], which 
maps the normal map of a point cloud to a Gaussian sphere, 
producing a Gaussian image. This spherical image can then 
be clustered to identify shapes. Woo et al. [21] propose an 
octree-based method for handling large point clouds, using 
edges to segment structures within. 

3. Difference of Normals Operator 
3.1. Theory 

The concept of scale-space has a well established theoretical
background for continuous and discrete signals, notably in
linking scale-space to the linear diffusion equation [ ] and in
establishing the scale-space axioms [8, 20]. A set of axioms
[8, 10, 20], the complete review of which is beyond the scope of
this paper, has been described to capture the properties of a
desirable and useful scale-space representation. Notably, it has
been proven that the Gaussian kernel is the only convolutional
filter that satisfies the complete set of scale-space axioms [8].

Although the Gaussian kernel is unique in satisfying the
complete set of scale-space axioms, for many applications an
operator that satisfies the full set of axioms is not required.
Such approaches are more generally referred to as multi-scale.
Multi-scale operators may be simpler and more computationally
efficient, and may have desirable properties such as
orientability.

This paper proposes to define a multi-scale operator for
unorganized point clouds directly using the estimated surface
normal map of an unorganized point cloud. The primary motivation
behind this is the observation that surface normals estimated at
any given radius reflect the underlying geometry of the surface
at the scale of the support radius. Although there are many
different methods of estimating surface normals (see §3.3),
normals are always estimated with a support radius (or via a
fixed number of neighbours). This support radius determines the
scale in the surface structure which the normal represents.

Figure 1: The normal support radius' relation to scale.

Fig. 1 illustrates this effect in 1D. Normals, n, and tangents,
T, estimated with a small support radius $r_s$ are affected by
small-scale surface structure (and similarly by noise). On the
other hand, normals and tangent planes estimated with a large
support radius $r_l$ are less affected by small-scale structure,
and represent the geometry of larger scale surface structures.

The intuition behind the approach proposed here is that if the
directions of the two surface normals are nearly identical, then
the structure of the surface does not change significantly from
the first radius to the second. By contrast, if the structure of
the larger neighbourhood around a center point is significantly
different from that of the smaller neighbourhood, then the
directions of the two estimated normals are likely to differ by
a larger margin. In that case, a value between the two radii is
often representative of the scale of the structure near the
center point.

Suppose a multi-scale operator for a point cloud is simply
defined as:

$$L(p, r) = n(p, r), \qquad (1)$$

with scale parameter r, effected by the normal map of a point
cloud P estimated with support radius r. Notice that the
response of our operator is a vector, and is thus orientable;
however, the operator's $\ell_2$ norm provides a more
conventional scale quantity.

Just as described by the most basic and intuitive scale-space
axiom, the effect of the normals on the implicit surface sampled
by a point cloud is to suppress most of the structures in the
surface with a characteristic dimension of less than r.
Furthermore, with increasing values of the scale parameter r,
fine scale surface structure is increasingly suppressed. Despite
this, Eqn. 1 does not satisfy all of the scale-space axioms
originally outlined by Witkin et al. [ ] and more recently
enumerated by Lindeberg et al. [10]; notably, it does not
satisfy the causality requirement introduced by Koenderink et
al. [8].



3.2. Method 

When applying the multi-scale operator defined in Eqn. 1, we
compare the responses at each point p over several radii
$r_1 < r_2 < \dots < r_n$. In the most basic case we can compare
the response of the operator across two different radii
$r_1 < r_2$. Formally, the Difference of Normals (DoN) operator
$\Delta_{\hat{n}}$ for any point p in a point cloud P is defined
as:

$$\Delta_{\hat{n}}(p, r_1, r_2) = \frac{n(p, r_1) - n(p, r_2)}{2}, \qquad (2)$$

where $r_1, r_2 \in \mathbb{R}$, $r_1 < r_2$, and n(p, r) is the
surface normal estimate at point p, given the support radius r.
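
A minimal sketch of the operator in Python follows. It assumes the DoN is half the difference of the two unit normals estimated at the small and large radii; the function names are illustrative and are not taken from the PCL implementation.

```python
import math


def difference_of_normals(n_small, n_large):
    """Return (n_small - n_large) / 2 for two unit normals given as 3-tuples."""
    return tuple((a - b) / 2.0 for a, b in zip(n_small, n_large))


def magnitude(v):
    """Euclidean (l2) norm of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


# Flat region: both radii yield the same normal, so the response vanishes.
flat = difference_of_normals((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))

# Near an edge: the small-radius normal is perpendicular to the large-radius one.
edge = difference_of_normals((1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
```

The response is a vector, so both its direction and its magnitude remain available to later processing steps.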

For a given $r_1$ and $r_2$, the result of applying the
$\Delta_{\hat{n}}$ operator to all the points in a point cloud
is a vector map where a DoN vector is assigned to each point.
Since each DoN vector is the normalized difference of two unit
normal vectors, the magnitude of the $\Delta_{\hat{n}}$ vectors
is always within [0, 1].
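
The bound on the magnitude can be checked with a short derivation. It assumes the DoN is half the difference of two unit normals, and introduces $\theta$ (a symbol not used elsewhere in this paper) for the angle between them:

```latex
\|\Delta_{\hat{n}}(p, r_1, r_2)\|^2
  = \tfrac{1}{4}\,\|n(p, r_1) - n(p, r_2)\|^2
  = \tfrac{1}{4}\,(2 - 2\cos\theta)
  = \sin^2(\theta/2),
\qquad\text{so}\qquad
\|\Delta_{\hat{n}}(p, r_1, r_2)\| = \sin(\theta/2) \in [0, 1]
\quad\text{for } \theta \in [0, \pi].
```

Under this reading, a magnitude threshold of 0.25 corresponds to an angle of $2\arcsin(0.25) \approx 29°$ between the two normals.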

The DoN vectors may be thresholded based on their magnitude,
i.e. $\|\Delta_{\hat{n}}(p)\|$, or based on their component
values, i.e. $\Delta_{\hat{n},x}(p)$, $\Delta_{\hat{n},y}(p)$,
or $\Delta_{\hat{n},z}(p)$, for orientable surfaces and edges.
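
Both kinds of thresholding can be sketched as simple filters over a per-point vector map. The function names and the absolute-value test on components are assumptions made for illustration only.

```python
import math


def filter_by_magnitude(points, don_vectors, threshold):
    """Keep the points whose DoN magnitude is at least `threshold`."""
    kept = []
    for point, don in zip(points, don_vectors):
        if math.sqrt(sum(c * c for c in don)) >= threshold:
            kept.append(point)
    return kept


def filter_by_component(points, don_vectors, axis, threshold):
    """Keep the points whose DoN component along `axis` (0=x, 1=y, 2=z)
    is at least `threshold` in absolute value."""
    return [point for point, don in zip(points, don_vectors)
            if abs(don[axis]) >= threshold]


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
dons = [(0.0, 0.0, 0.0), (0.5, 0.0, -0.5), (0.1, 0.0, 0.0)]
salient = filter_by_magnitude(points, dons, 0.25)
```

Filtering on a single component keeps only responses oriented along a chosen axis, which is what makes the operator usable for oriented edges.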

Calculating the two normal maps estimated with support radii
$r_1$, $r_2$ for a scene is a process which is highly
parallelizable and thus greatly benefits from GPU optimization.
Consequently, DoN computation, even for very large scale point
clouds, may be performed very efficiently (see §4.4).

3.3. Approximating Normals in Range Data 

3.3.1 Normals Estimation 

There are many methods for estimating normals (or equiv- 
alently tangent planes) in point clouds [1, 5, 6]. However, 
only those using a fixed support radius, rather than a fixed 
number of neighbors, are suitable for unorganized data, es- 
pecially when the point cloud density is highly variable. 

Applying a method based on a fixed number of neighbors 
to a point cloud with a high variability in sampling density, 
e.g. urban LIDAR data, results in each normal being computed
using what may be a very different support radius, and
thus the estimated normals at each point will represent the 
surface at very different scales. Such normals would be un- 
suitable for DoN calculations. 

In our experiments, the normals were estimated by find- 
ing the tangent plane using the principal components of 
a local neighborhood of fixed support radius around each 
point. This neighborhood may contain any number of points 
N > 3. The result is that all the normals in the scene are 
calculated at the same scale. However, due to the highly 
variable sampling density/resolution of some range data, 
the accuracy of the normal estimate may vary considerably
across a scene with N.
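
The fixed-radius PCA estimate can be sketched in dependency-free Python. The brute-force neighbour search and the power iteration used to find the smallest principal component are stand-ins chosen for brevity; a practical implementation would more likely use a k-d tree and a library eigen-solver. All names are illustrative.

```python
import math


def neighbors_within(points, center, radius):
    """Brute-force fixed-radius search around `center`."""
    r2 = radius * radius
    return [q for q in points
            if sum((a - b) ** 2 for a, b in zip(q, center)) <= r2]


def covariance(neigh):
    """3x3 covariance matrix of a list of 3D points."""
    n = len(neigh)
    mean = [sum(q[i] for q in neigh) / n for i in range(3)]
    return [[sum((q[i] - mean[i]) * (q[j] - mean[j]) for q in neigh) / n
             for j in range(3)] for i in range(3)]


def smallest_eigenvector(c, iterations=100):
    """Power iteration on trace(C) * I - C, whose dominant eigenvector is the
    eigenvector of C with the smallest eigenvalue. Three axis-aligned starts
    are tried so that at least one is not orthogonal to the answer."""
    trace = c[0][0] + c[1][1] + c[2][2]
    m = [[(trace if i == j else 0.0) - c[i][j] for j in range(3)]
         for i in range(3)]
    best, best_energy = None, float("inf")
    for start in ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)):
        v = list(start)
        for _ in range(iterations):
            w = [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Variance of the neighbourhood along v; the normal minimizes it.
        energy = sum(v[i] * c[i][j] * v[j] for i in range(3) for j in range(3))
        if energy < best_energy:
            best, best_energy = tuple(v), energy
    return best


def estimate_normal(points, center, radius):
    """Unit normal of the PCA tangent plane, or None with fewer than 3 neighbours."""
    neigh = neighbors_within(points, center, radius)
    if len(neigh) < 3:
        return None
    return smallest_eigenvector(covariance(neigh))
```

The sign of the returned normal is arbitrary, which is why a disambiguation step is still needed before two normals are compared.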

It is important to note that PCA is not robust to outliers, and
for some applications more robust methods of normal estimation
may be more suitable [ ]. However, in our experiments we found
PCA-estimated normals to be sufficient even in the presence of
highly unorganized data.



3.3.2 Resolving Normal Ambiguity 

Surface normals estimated on point clouds exhibit a sign 
ambiguity in their direction. This is because any tangent 
plane to a point has two normals in opposite directions, ei- 
ther of which is mathematically valid. In many applications, 
this normal ambiguity is typically resolved with the sensor 
context, since the correct normal is always the one point- 
ing in the hemisphere towards the range sensor [ ]. For 
the particular application of DoN operations, the particular 
choice of resolving the normal sign ambiguity has no conse- 
quence so long as the normals for the two support radii are 
disambiguated in the same manner. Thus the disambiguation of the
normals can simply be achieved by negating one of the normals if
$\angle\big(n(p, r_1), n(p, r_2)\big) > \frac{\pi}{2}$, i.e. if
the angle between the two normals is greater than 90° (or,
equivalently, if their dot product is negative). This is under
the assumption that the true surface normals must be within an
angle of $\frac{\pi}{2}$ of each other, which is a realistic
assumption given the limitation of scanning in the presence of
self-occlusion.
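
The disambiguation rule can be sketched as a sign flip driven by the dot product, since an angle above 90° between two unit vectors is equivalent to a negative dot product. The function name and the choice to flip the large-radius normal are illustrative.

```python
def align_normals(n_small, n_large):
    """Flip n_large when it lies more than 90 degrees away from n_small."""
    dot = sum(a * b for a, b in zip(n_small, n_large))
    if dot < 0.0:
        n_large = tuple(-c for c in n_large)
    return n_small, n_large


# Two estimates of the same flat surface that differ only in sign.
a, b = align_normals((0.0, 0.0, 1.0), (0.0, 0.0, -1.0))
```

After alignment, the pair from the example yields a zero DoN response, as expected for a flat surface.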

4. Experimental Results 

4.1. Parameter Selection 

Selecting the parameters $r_1$ and $r_2$ for DoN is important
since, while a wide range of parameters may elicit large
responses from the surface of interest, naive parameter choices
may also elicit large responses from other classes of surfaces.
We propose a simple parameter selection algorithm where the
objective is to choose parameters maximizing the DoN magnitude
for the set of points within the objective class, while
minimizing the DoN magnitude for other known classes of
surfaces/objects in the scene.

In practice, given a set of ground truth point clouds for 
our objective object (e.g. cars) and a set of ground truth 
point clouds for the objects in close vicinity of the objective 
object (e.g. road, people), we compare the aggregate re- 
sponse statistics (i.e. median, mean, variance) for all points 
in each class across a selection of DoN parameters. 
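
One way to automate this comparison is sketched below. The criterion used here, the gap between the median responses of the objective class and of the other classes, is an assumption made for illustration; the statistics in this paper were compared empirically, and the magnitudes in the example are made up.

```python
from statistics import median


def select_parameters(target_responses, other_responses):
    """Return the (r_s, r_l) pair with the largest median-response gap.

    Both arguments map a (r_s, r_l) tuple to a list of DoN magnitudes
    measured on points of the objective class and of the other classes.
    """
    def gap(params):
        return median(target_responses[params]) - median(other_responses[params])

    return max(target_responses, key=gap)


# Hypothetical magnitudes, for illustration only.
target = {(0.1, 0.4): [0.5, 0.6, 0.7], (0.4, 2.0): [0.2, 0.3, 0.4]}
other = {(0.1, 0.4): [0.1, 0.2, 0.1], (0.4, 2.0): [0.3, 0.3, 0.2]}
best = select_parameters(target, other)
```

The mean or variance could be substituted for the median in `gap` without changing the structure of the search.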

Fig. 2 shows the mean, median and variance responses for a set
of object classes from a single data sequence in the KITTI
dataset [ ] over a range of parameters $r_s$, $r_l$. Using this,
for example, we empirically set the parameters $r_s = 0.1$,
$r_l = 0.4$ for pedestrians and $r_s = 0.4$, $r_l = 2.0$ for
cars in order to maximize the inter-class response distance in a
scene containing both objects in close proximity.

Figure 2: Aggregate per-object class statistics used in
parameter selection. (a) Car aggregate statistics. (b)
Pedestrian aggregate statistics.

4.2. TITAN Urban Mobile LIDAR Data 

In GIS applications, there is a focus on the recognition 
of street furniture (a GIS term describing lamp posts, fire 
hydrants, curbs, etc.) and the extraction of large-scale in- 
frastructure (e.g. buildings, roads). The DoN operator is 
an ideal tool for addressing such problems by isolating ob- 
jects in the scene based on their scale. The following results 
demonstrate various applications of DoN to real-world data,
towards automatic segmentation and as a pre-processing 
step in object recognition or annotation in urban LIDAR 
data. 

The point clouds used in the experiments reported here
are from a TITAN system [ ]. The raw data is collected via
a LIDAR scanner mounted on a moving vehicle and scan- 
ning the urban scenery as the vehicle traverses the street. 
The individual scans are then registered together to form 
complete 3D point clouds of large urban areas. As the reg- 
istration of many low (~ 0.1 cm) resolution scans from a 
mobile platform, the final point clouds have a highly vari- 
able sampling density governed mostly by the speed of the 
vehicle, the changing obstructions in the scene, and the reg- 
istration error. Despite the large amount of data in a typi- 
cal TITAN scene, individual objects are often composed of 
small numbers of points, e.g. ~ 100 points in the case of a 
person. All these factors make processing the TITAN point 
clouds a particularly challenging domain. 

For illustration purposes, the results in this section are
demonstrated using small (25 m³) sections of real-world urban
LIDAR data from the city of Kingston, ON, Canada, collected by
the TITAN mobile terrestrial scanner. Similar results were also
observed on much larger datasets of hundreds of millions of
points.

4.2.1 DoN Features 





Figure 3: DoN magnitude results on 'Bagot St, Kingston, ON,
Canada'. (a) Point cloud (478,377 points). (b)
$\|\Delta_{\hat{n}}(p, 0.2\,\mathrm{m}, 2\,\mathrm{m})\|$.

The DoN operator has two parameters, a large radius ($r_2$) and
a small radius ($r_1$). While each structure may exhibit a
response in a range of scales, it will generally have a natural
scale at which this response is maximized. Empirically, it was
found that thresholding the magnitude of the
$\Delta_{\hat{n}}(p)$ vectors obtained with a scale ratio
($r_2/r_1$) of 10 provided good results for filtering out points
belonging to large scale planar surfaces. Fig. 3a illustrates a
typical urban LIDAR scene, for which the magnitudes of the DoN
vectors for each scene point (i.e. $\|\Delta_{\hat{n}}(p)\|$) at
three different scales are shown in Figs. 3b, 3c, and 3d. The
magnitudes, which are in the [0, 1] range, are colorized
according to the color map shown at the bottom of the image.

For DoN parameters corresponding to small scales (e.g. within
the 0.2-2 m range), points belonging to smaller scale objects
have strong responses. For example, in Fig. 3b the finest scale
structures exhibit the strongest response. These include road
curbs, window ledges, and the details in building facades. For
DoN parameters corresponding to larger scales (i.e. 2-20 m),
points belonging to larger structures have strong responses. For
example, in Fig. 3d the building points have a large response,
yet very large scale structures (i.e. the road surface) still
exhibit a small response.

4.2.2 DoN Scale-Based Filtering 

An important application of the DoN operator, as motivated by
the results shown in Fig. 3, is to use it as a salience operator
to pre-filter point clouds. Fig. 4 shows the results of such a
filtering of a point cloud, discarding all the points for which
$\|\Delta_{\hat{n}}(p)\| < 0.25$, on a typical urban scene, with
various DoN parameters corresponding to a range of scales.

Figure 4: DoN filtering results on 'Intersection of Clergy and
Johnson St, Kingston, ON, Canada'. (a) Original: 614403 points.
(b) $\|\Delta_{\hat{n}}(p, 0.1\,\mathrm{m}, 1\,\mathrm{m})\| > 0.25$: 135518 points.
(c) $\|\Delta_{\hat{n}}(p, 0.2\,\mathrm{m}, 2\,\mathrm{m})\| > 0.25$: 132708 points.
(d) $\|\Delta_{\hat{n}}(p, 0.8\,\mathrm{m}, 8\,\mathrm{m})\| > 0.25$: 139367 points.

At the lowest scale (0.1-1.0 m), shown in Fig. 4b, sharp edges
are clearly preserved, including building edges (window outlines
and pipes) and ground edges (street curbs). Also preserved,
however, is artificial structure derived from noise, which is to
be expected at approximately the resolution of the data. By the
next, incrementally larger, scale (0.2-2.0 m), shown in Fig. 4c,
the noise has been filtered out. As the scale is increased,
larger and larger objects are preserved, while smaller objects
are increasingly discarded. At the largest scale (0.8-8.0 m),
shown in Fig. 4d, larger building fronts and walls are segmented
from the rest of the scene.

4.2.3 Segmentation 

DoN filtering of a point cloud, such as that described in
§4.2.2, was found to result in good isolation of points in urban
LIDAR scenes. Applying a simple clustering method to the
resulting point cloud results in the clear clustering of many
objects of interest in a scene. A simple Euclidean distance
threshold based clustering algorithm (Euclidean Cluster
Extraction [14]) was applied with a distance tolerance of $r_1$,
a minimum of 100 cluster points, and a maximum of 100,000
cluster points. Fig. 5 shows the results of such clustering on
DoN filtered scenes for various DoN parameters. Each cluster in
the scene is assigned a random (non-unique) color. Figs. 5c-5g
illustrate various clusters in the scene, corresponding to
various objects including a person, a traffic light fixture, a
window, a car, and a tree.
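
The clustering step can be sketched as a brute-force region growing over the filtered points. This is an illustrative stand-in rather than the PCL implementation, which relies on a spatial search structure; the function and parameter names are assumptions.

```python
from collections import deque


def euclidean_clusters(points, tolerance, min_size=1, max_size=None):
    """Group point indices that are chained together by gaps of at most
    `tolerance`, keeping clusters whose size lies in [min_size, max_size]."""
    tol2 = tolerance * tolerance
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, cluster = deque([seed]), [seed]
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited
                    if sum((a - b) ** 2
                           for a, b in zip(points[i], points[j])) <= tol2]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        if len(cluster) >= min_size and (max_size is None or len(cluster) <= max_size):
            clusters.append(sorted(cluster))
    return clusters


pts = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0),
       (5.0, 0.0, 0.0), (5.1, 0.0, 0.0)]
found = sorted(euclidean_clusters(pts, 0.15))
```

Using the small DoN radius as the tolerance ties the clustering distance to the scale at which the points were selected.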



Figure 5: DoN clustering results for 'Intersection of Princess
and Bagot St, Kingston, ON, Canada' and sample clusters from the
scene. (a) Original point cloud (620,820 points). (b) Clusters
found in $\|\Delta_{\hat{n}}(p, 0.2\,\mathrm{m}, 2\,\mathrm{m})\| > 0.25$.
(c) Person cluster. (d) Traffic light cluster. (e) Window
cluster. (f) Car cluster. (g) Tree cluster.



Such segmentation might form a fundamental pre-processing step
in an object recognition pipeline for finding objects in an
urban LIDAR scene [ ]. The pipeline would include the described
clustering method, followed by feeding individual clusters into
an object recognition algorithm. Since the clustering algorithm
isolates individual objects (such as cars, people, fire
hydrants, etc.), DoN clustering enables the use of global object
recognition methods that require pre-segmentation [17].

4.3. KITTI Vision Benchmark Suite 
4.3.1 KITTI Dataset 



Although the TITAN Urban Mobile LIDAR Data appli- 
cation was motivational in the initial development of the 
proposed method, it is not a public dataset and does not 
have readily available ground truth. Thus for a repeatable 
quantitative evaluation, we used the KITTI Vision Benchmark
Suite [ ]. The KITTI dataset includes a large number
of point clouds along with annotated ground truth bound- 
ing boxes for objects of interest to driving and navigation. 
Each sequence consists of a number of frames, where each 
frame has an inertially corrected 3D Velodyne point cloud 
(~100k points per frame), and manually annotated 3D ob- 
ject bounding boxes for cars, trucks, trams, pedestrians, and 
cyclists. 

Although the KITTI data also consists of unorganized 
point clouds, it is far sparser than the TITAN point clouds, 
and was captured with a single 360° sensor rather than an
array of line scanners. Fig. 6 illustrates a sample Velodyne 
point cloud from a single frame. 



4.3.2 Method 

In order to evaluate DoN based segmentation, as illustrated in
§4.2.3, a set of DoN parameters ($r_1$, $r_2$) and DoN magnitude
thresholds t were chosen based on the parameter selection
algorithm outlined in §4.1. As in the method outlined in §4.2.1
through §4.2.3, the DoN was calculated for a sequence of frames
(point clouds), after which the DoN magnitude was thresholded by
a fixed value (t = 0.25), Euclidean Cluster Extraction [15] was
performed with a distance threshold equivalent to the smallest
DoN radius $r_1$, and a set of clusters was extracted.

Figure 6: A single frame KITTI Velodyne point cloud.

For each frame, the set of clusters was compared with 
each of the ground truth bounding box labels to identify
the cluster with highest intersection. This candidate cluster 
was then compared with the ground truth point cloud by 
collecting various statistics. 

The main measures used to evaluate the quality of the
segmentation were precision (the ratio of correctly predicted
object points to the total number of predicted object points)
and recall (the ratio of correctly predicted object points to
the number of ground truth object points).
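
The candidate selection and the two measures can be sketched over sets of point indices. Representing each cluster and each ground truth object as the set of point indices it contains is an assumption made for brevity; the names are illustrative.

```python
def best_cluster(clusters, ground_truth):
    """Return the cluster sharing the most points with the ground truth object."""
    return max(clusters, key=lambda cluster: len(cluster & ground_truth))


def precision_recall(predicted, ground_truth):
    """Precision and recall of a predicted point set against the ground truth."""
    correct = len(predicted & ground_truth)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(ground_truth) if ground_truth else 0.0
    return precision, recall


clusters = [{0, 1, 2, 3}, {10, 11}]
truth = {2, 3, 4, 5, 6, 7}
candidate = best_cluster(clusters, truth)
precision, recall = precision_recall(candidate, truth)
```

In this example two of the four predicted points are correct and two of the six ground truth points are recovered.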

Due to the nature of the Velodyne data, in many frames 
the point clouds within ground truth bounding boxes may 
consist of very few points (< 100). It was judged that such 
extremely sparse ground truth objects were unsuitable for 
evaluating the segmentation of smaller scale objects, and 
since our clustering algorithm's minimum threshold was 
100 points, for all of the object classes a minimum of 100 
points was required for a ground truth point cloud to be used 
in evaluation. 

4.3.3 KITTI Results 

Fig. 7 illustrates the results of our evaluation in the form of
a precision/recall graph over thousands of ground truth objects
on two different sequences in the KITTI dataset,
2011_09_26_drive_0001 (Fig. 7b, 7c) and 2011_09_26_drive_0 9
(Fig. 7d). Each data point is of a size proportional to its
ground truth point cloud's size (note: scale is not preserved
inter-class).

The majority of the results have a precision > 0.9. However,
the recall values depend more on the class of object and the DoN
parameters. Smaller scale objects, such as pedestrians and
cyclists, have higher recall/precision for the smaller
parameters of $r_1$, $r_2$, as can be seen in Fig. 7b, 7c, while
larger scale objects such as cars and vans have higher
recall/precision for larger radii, as can be seen in Fig. 7d.

While it is difficult to compare the performance of
algorithms evaluated on different test sets (we advocate
the usage of the public KITTI dataset), the results appear 
favourable in comparison with the recall/precision of more 
computationally intensive mesh and graph-based segmenta- 
tion methods evaluated on less challenging datasets [4]. 

4.4. Computational Efficiency 

The computation of DoN on a scene requires the calculation of
the normal maps and is bound by the nearest neighbor radius
search for the largest radius parameter $r_2$. In practice, for
large radii, this can involve calculating the normal to a point
using hundreds of thousands of points. Instead an approximation
can be calculated by uniformly sub-sampling the point cloud used
for the nearest neighbor search. The current implementation of
DoN can do this given a decimation parameter d, where the search
point cloud is sub-sampled using a uniform re-sampling
algorithm, with the point cloud coarsely voxelized into voxels
of length $r_1/d$ for the small radius normals and $r_2/d$ for
the large radius normals. It was found that an approximation
with d = 10 results in negligible error, while halving the time
for calculating DoN with $r_1 = 0.1$, $r_2 = 1.0$ on the point
cloud in Fig. 3 with 478,348 points to 3454.33 ms, compared with
7812.5 ms for the full calculation on a 3.2 GHz i7. A
preliminary GPU implementation of DoN was found to be an order
of magnitude faster, taking only 565.46 ms on an NVIDIA GTX 480
for the full computation.

Figure 7: Results of DoN clustering vs. ground truth. (b)
Clusters from $\|\Delta_{\hat{n}}(0.1, 1.4)\| > 0.25$ with
threshold 0.2. (d) Clusters from $\|\Delta_{\hat{n}}(1, 3)\| > 0.25$
with threshold 1.0.
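
The decimation of the search cloud described in this section can be sketched as a voxel-grid sub-sampling with a voxel edge of radius / d. Replacing each occupied voxel by the centroid of its points is an assumption; the re-sampling algorithm used in the implementation is not specified here, and the names are illustrative.

```python
import math


def voxel_downsample(points, leaf):
    """Replace all points in the same cubic voxel of edge `leaf` by their centroid."""
    voxels = {}
    for p in points:
        key = tuple(math.floor(c / leaf) for c in p)
        voxels.setdefault(key, []).append(p)
    return [tuple(sum(q[i] for q in group) / len(group) for i in range(3))
            for group in voxels.values()]


def search_cloud(points, radius, decimation):
    """Sub-sampled cloud used for the neighbour search at a given support radius."""
    return voxel_downsample(points, radius / decimation)


cloud = [(0.01, 0.01, 0.01), (0.03, 0.03, 0.03), (0.55, 0.55, 0.55)]
coarse = search_cloud(cloud, 1.0, 10)  # voxel edge of 0.1
```

Because the voxel edge scales with the support radius, the number of neighbours per normal estimate stays roughly bounded as the radius grows.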

5. Conclusion and Future Work 

The Difference of Normals as a multi-scale operator was 
introduced for unorganized 3D point clouds. Illustrated re- 
sults on dense urban LIDAR data qualitatively showcased 
the effectiveness of DoN filtering in keeping points belong- 
ing to objects at a given scale, while discarding those be- 
longing to structures of other scales. The application of 
DoN as a scale-based filtering and segmentation tool was
highlighted in urban LIDAR scenes. Results on a typical 
urban street intersection with clustering showed a clear seg- 
mentation of points belonging to various objects of inter- 
est at different scales, such as cars, road curbs, trees, and 
buildings - some having as few as 100 points. With urban 
LIDAR scenes typically containing millions of points, DoN 
filtering provides a substantial reduction in points for per- 
forming any further processing of the scene. 

The quality of DoN-based segmentation was quantita- 
tively evaluated on a large, publicly available dataset of 
sparse, unorganized urban LIDAR data. Objects such as 
cars and pedestrians were automatically segmented from the 
scene and compared with ground truth annotations. 

Future work includes the development of a DoN based 
surface descriptor to exploit the defined scale operator over 
several radii, and integration with object recognition methods.
The development of an interactive semi-automated
tool for annotating large scale 3D point clouds in particu- 
lar would go a long way towards simplifying the generation 
of GIS models from urban LIDAR point clouds. 